adversarial training data
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > Canada (0.04)
- Asia > China > Beijing > Beijing (0.04)
Applications of Adversarial Training part3(Machine Learning)
Abstract: While adversarial training is generally used as a defense mechanism, recent works show that it can also act as a regularizer. By co-training a neural network on clean and adversarial inputs, it is possible to improve classification accuracy on the clean, non-adversarial inputs. We demonstrate that, contrary to previous findings, it is not necessary to separate batch statistics when co-training on clean and adversarial inputs, and that it is sufficient to use adapters with few domain-specific parameters for each type of input. We establish that using the classification token of a Vision Transformer (ViT) as an adapter is enough to match the classification performance of dual normalization layers, while using significantly less additional parameters. First, we improve upon the top-1 accuracy of a non-adversarially trained ViT-B16 model by 1.12% on ImageNet (reaching 83.76% top-1 accuracy).
Strength-Adaptive Adversarial Training
Yu, Chaojian, Zhou, Dawei, Shen, Li, Yu, Jun, Han, Bo, Gong, Mingming, Wang, Nannan, Liu, Tongliang
Adversarial training (AT) is proved to reliably improve network's robustness against adversarial data. However, current AT with a pre-specified perturbation budget has limitations in learning a robust network. Firstly, applying a pre-specified perturbation budget on networks of various model capacities will yield divergent degree of robustness disparity between natural and robust accuracies, which deviates from robust network's desideratum. Secondly, the attack strength of adversarial training data constrained by the pre-specified perturbation budget fails to upgrade as the growth of network robustness, which leads to robust overfitting and further degrades the adversarial robustness. To overcome these limitations, we propose \emph{Strength-Adaptive Adversarial Training} (SAAT). Specifically, the adversary employs an adversarial loss constraint to generate adversarial training data. Under this constraint, the perturbation budget will be adaptively adjusted according to the training state of adversarial data, which can effectively avoid robust overfitting. Besides, SAAT explicitly constrains the attack strength of training data through the adversarial loss, which manipulates model capacity scheduling during training, and thereby can flexibly control the degree of robustness disparity and adjust the tradeoff between natural accuracy and robustness. Extensive experiments show that our proposal boosts the robustness of adversarial training.
Deep Adversarially-Enhanced k-Nearest Neighbors
Recent works have theoretically and empirically shown that deep neural networks (DNNs) have an inherent vulnerability to small perturbations. Applying the Deep k-Nearest Neighbors (DkNN) classifier, we observe a dramatically increasing robustness-accuracy trade-off as the layer goes deeper. In this work, we propose a Deep Adversarially-Enhanced k-Nearest Neighbors (DAEkNN) method which achieves higher robustness than DkNN and mitigates the robustness-accuracy trade-off in deep layers through two key elements. First, DAEkNN is based on an adversarially trained model. Second, DAEkNN makes predictions by leveraging a weighted combination of benign and adversarial training data. Empirically, we find that DAEkNN improves both the robustness and the robustness-accuracy trade-off on MNIST and CIFAR-10 datasets.
- North America > United States > Michigan (0.05)
- Asia > Middle East > Jordan (0.04)
Learning to Confuse: Generating Training Time Adversarial Data with Auto-Encoder
Feng, Ji, Cai, Qi-Zhi, Zhou, Zhi-Hua
In this work, we consider one challenging training time attack by modifying training data with bounded perturbation, hoping to manipulate the behavior (both targeted or non-targeted) of any corresponding trained classifier during test time when facing clean samples. To achieve this, we proposed to use an auto-encoder-like network to generate the pertubation on the training data paired with one differentiable system acting as the imaginary victim classifier. The perturbation generator will learn to update its weights by watching the training procedure of the imaginary classifier in order to produce the most harmful and imperceivable noise which in turn will lead the lowest generalization power for the victim classifier. This can be formulated into a non-linear equality constrained optimization problem. Unlike GANs, solving such problem is computationally challenging, we then proposed a simple yet effective procedure to decouple the alternating updates for the two networks for stability. The method proposed in this paper can be easily extended to the label specific setting where the attacker can manipulate the predictions of the victim classifiers according to some predefined rules rather than only making wrong predictions. Experiments on various datasets including CIFAR-10 and a reduced version of ImageNet confirmed the effectiveness of the proposed method and empirical results showed that, such bounded perturbation have good transferability regardless of which classifier the victim is actually using on image data.